Abstract

Let $C_n$ be the origin-containing cluster in subcritical percolation on the lattice $\frac{1}{n}\mathbb{Z}^d$,
viewed as a random variable in the space $\Omega$ of compact, connected, origin-containing
subsets of $\mathbb{R}^d$, endowed with the Hausdorff metric $\delta$. When $d \geq 2$, and $\Gamma$ is any open
subset of $\Omega$, we prove that

\[ \lim_{n \to \infty} \frac{1}{n} \log P(C_n \in \Gamma) = -\inf_{S \in \Gamma} \lambda(S) \]

where $\lambda(S)$ is the one-dimensional Hausdorff measure of $S$ defined using the correlation
norm:

\[ \|u\| := \lim_{n \to \infty} -\frac{1}{n} \log P(u_n \in C_n) \]

where $u_n$ is $u$ rounded to the nearest element of $\frac{1}{n}\mathbb{Z}^d$. Given points $a^1, \ldots, a^k \in \mathbb{R}^d$,
there are finitely many correlation-norm Steiner trees spanning these points and the
origin. We show that if the $C_n$ are each conditioned to contain the points $a^1_n, \ldots, a^k_n$,
then the probability that $C_n$ fails to approximate one of these trees tends to zero
exponentially in $n$.

1 Introduction 

Let $C_n$ be the origin-containing cluster in subcritical Bernoulli bond-percolation with
parameter $p$ on the lattice $\frac{1}{n}\mathbb{Z}^d$; we view $C_n$ as a random variable in the space $\Omega$ of compact,
connected, origin-containing subsets of $\mathbb{R}^d$. When the probability measure involved is clear
from context, we use $P(A)$ to denote the probability of an event $A$. When $u \in \mathbb{R}^d$, let $u_n$ be
the vector $u$ rounded to the nearest element in $\frac{1}{n}\mathbb{Z}^d$. We define the "correlation norm" by

\[ \|u\| := \lim_{n \to \infty} -\frac{1}{n} \log P(u_n \in C_n). \]

This limit exists for all $u \in \mathbb{R}^d$ (with $\|u\| \in (0, \infty)$ for $u \neq 0$) and $\|\cdot\|$ is a strictly convex norm
(i.e., if $u$ and $v$ are not on the same line through the origin, then $\|u + v\| < \|u\| + \|v\|$) that
is real-analytic on the Euclidean unit sphere $S^{d-1}$. Denote by $\lambda(S)$ the one-dimensional
Hausdorff measure of the set $S$ defined with the above norm; in particular, if $S \in \Omega$ is a
finite union of rectifiable arcs in $\mathbb{R}^d$, then $\lambda(S)$ is the sum of the correlation-norm lengths of
those arcs.
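
As a rough illustration of the last sentence, the sketch below sums norm lengths over a finite list of line segments. The correlation norm is not available in closed form, so the `norm` argument is a hypothetical stand-in (the Euclidean norm in the usage lines), and the names `norm_length` and `euclidean` are ours. The sketch assumes the segments meet only at endpoints, so that no length is counted twice.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]
Segment = Tuple[Point, Point]


def norm_length(segments: Sequence[Segment],
                norm: Callable[[Point], float]) -> float:
    """Sum of norm(b - a) over segments (a, b).

    Assumption: segments overlap at most at endpoints, so the sum equals
    the one-dimensional measure of their union with respect to `norm`.
    """
    total = 0.0
    for a, b in segments:
        difference = tuple(bi - ai for ai, bi in zip(a, b))
        total += norm(difference)
    return total


def euclidean(u: Point) -> float:
    """Euclidean norm, used here only as a placeholder for the correlation norm."""
    return math.hypot(*u)


# Two segments forming a path from (0, 0) to (3, 4) to (3, 5).
path = [((0.0, 0.0), (3.0, 4.0)), ((3.0, 4.0), (3.0, 5.0))]
length = norm_length(path, euclidean)
```

Any other norm on $\mathbb{R}^d$ can be passed in place of `euclidean`; only the Euclidean case is exercised here.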

Given a set $X \subset \mathbb{R}^d$, denote by $B_\epsilon(X)$ the set of all points of distance less than $\epsilon$ from
some point in $X$. Given sets $X, Y \in \Omega$, let $\delta(X, Y)$ be their Hausdorff distance, i.e.,

\[ \delta(X, Y) = \inf\{\epsilon : X \subset B_\epsilon(Y),\ Y \subset B_\epsilon(X)\}. \]
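
For finite point sets the infimum above reduces to a max-min formula, which the following minimal sketch computes. It assumes Euclidean distance between points and finite, non-empty inputs; the function names are ours, and general compact sets would need to be discretized first.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def directed_distance(xs: Sequence[Point], ys: Sequence[Point]) -> float:
    """Largest distance from a point of xs to its nearest point of ys."""
    return max(min(math.dist(x, y) for y in ys) for x in xs)


def hausdorff_distance(xs: Sequence[Point], ys: Sequence[Point]) -> float:
    """Hausdorff distance between two finite, non-empty point sets.

    This is the smallest eps such that each set lies in the closed
    eps-neighborhood of the other, which agrees with the infimum over
    open neighborhoods used in the text.
    """
    return max(directed_distance(xs, ys), directed_distance(ys, xs))


X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
d_xy = hausdorff_distance(X, Y)
```

Here every point of `X` also lies in `Y`, so the distance is determined by the extra point `(1.0, 2.0)` of `Y`, whose nearest point in `X` is `(1.0, 0.0)`.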

Many authors, including pQ, jS|, [3] and [12], have investigated the shapes of "typical"
large finite clusters in supercritical percolation on $\mathbb{Z}^d$ by proving surface order large deviation
principles for clusters conditioned to contain at least $m$ vertices. They have shown that as
$m$ gets large, the shapes of typical clusters are approximately minimizers of surface tension
integrals, called Wulff crystals. Moreover, the surface tension integral is a rate function for a
large deviation principle (with surface order speed $m^{(d-1)/d}$) on cluster shapes. These results
are one way of precisely answering the questions, "What does the typical 'large' cluster look
like? How unlikely are large deviations from this typical shape?"

If instead of number of vertices we define "large" in terms of, say, diameter or volume of 
the convex hull, then these questions can be answered for subcritical percolation using the 
following linear speed large deviation principle: 

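Theorem 1.1. Let $d \geq 2$, $p < p_c$, and $\Gamma \subset \Omega$ be Borel-measurable. Then

\[ -\inf_{S \in \Gamma^\circ} \lambda(S) \leq \liminf_{n \to \infty} \frac{1}{n} \log P(C_n \in \Gamma) \leq \limsup_{n \to \infty} \frac{1}{n} \log P(C_n \in \Gamma) \leq -\inf_{S \in \overline{\Gamma}} \lambda(S) \]

where $\Gamma^\circ$ and $\overline{\Gamma}$ are the interior and closure of $\Gamma$ with respect to the Hausdorff topology.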

In the language of [5], this says that the random variables $C_n$ satisfy a large deviation
principle with respect to the Hausdorff metric topology on $\Omega$ and with speed $n$ and rate
function $I(S) = \lambda(S)$. Note that since $\lambda : \Omega \to \mathbb{R}$ is continuous, this implies that

\[ \lim_{n \to \infty} \frac{1}{n} \log P(C_n \in \Gamma) = -\inf_{S \in \Gamma} \lambda(S) \]

whenever $\Gamma$ is an open subset of $\Omega$.

Acknowledgments. We thank Amir Dembo and his probability discussion group for helpful 
conversations and thank Yuval Peres for some suggestions on the presentation. Also, Raphael 
Cerf has informed us that Olivier Couronne, working independently, produced an alternate 
proof of Theorem 1.1 and was nearly finished writing up the result at the time that our
paper was submitted and posted to the arXiv [3]. 

2 Proof of large deviation principle 

2.1 Exponential tightness and an equivalent formulation 

We now prove Theorem 1.1. The sets $\{S' \mid \delta(S', \{0\}) \leq a\}$ are compact in the Hausdorff metric
topology, and $P(\delta(C_n, \{0\}) > a)$ decays exponentially in $n$ and $a$ [TT]. This implies that the
laws of the $C_n$ are exponentially tight (in the sense of [3], Sec. 1.2). Given this exponential
tightness, Theorem 1.1 is equivalent to the statement that the following bounds hold for
$S \in \Omega$:

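\[ \lim_{\epsilon \to 0} A(S, \epsilon) \leq \lambda(S) \]
\[ \lim_{\epsilon \to 0} B(S, \epsilon) \geq \lambda(S) \]

where

\[ A(S, \epsilon) = \limsup_{n \to \infty} -\frac{1}{n} \log P(\delta(S, C_n) < \epsilon) \]
\[ B(S, \epsilon) = \liminf_{n \to \infty} -\frac{1}{n} \log P(\delta(S, C_n) < \epsilon) \]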

This equivalence is well-known in the large deviations literature ([5], Lemma 1.2.18 and
Theorem 4.1.11), and is also not hard to prove directly. We now prove the first of the two
bounds above, which involves giving a lower bound on the probabilities $P(\delta(S, C_n) < \epsilon)$.

2.2 Lower bound on probabilities 

Fix $\epsilon$ and choose $S'$ to be a connected union of finitely many line segments of the form
$(a^i, b^i)$, for $1 \leq i \leq k$, intersecting one another only at endpoints, such that $\delta(S, S') < \epsilon/2$
and at least one of the segments includes the origin as an endpoint. No matter how small $\epsilon$
gets, we can always choose such an $S'$ of total length less than or equal to $\lambda(S)$. Thus, it is
enough to show that

\[ \liminf_{n \to \infty} -\frac{1}{n} \log P(\delta(S', C_n) < \epsilon/2) \leq \lambda(S') \]

for sets $S'$ of this form.

Now, let $A^i_n$ (respectively, $A^i_{n,c}$) be the event that $a^i_n$ and $b^i_n$ are connected by some open
path whose Hausdorff distance from the line segment $(a^i, b^i)$ is at most $\epsilon/4$ (respectively $c/n$).
For any fixed $n$, $P(A^i_{n,c})$ tends to $P(a^i_n - b^i_n \in C_n)$ as $c$ tends to $\infty$. Subadditivity arguments
imply that $\liminf -\frac{1}{n} \log P(A^i_{n,c})$ tends to $\|a^i - b^i\|$ as $c$ tends to infinity. It follows that
$\liminf -\frac{1}{n} \log P(A^i_n) \leq \|a^i - b^i\|$. The FKG inequality then implies that
$\liminf -\frac{1}{n} \log P(\cap_i A^i_n) \leq \lambda(S')$.

Now, we have to show that given $\cap_i A^i_n$, the probability of the event $C_n \not\subset B_{\epsilon/2}(S')$ decays
exponentially. Let $D_n$ be the event that there is a path from any point $x$ outside of $B_{\epsilon/2}(S')$
to any point $y \in B_{\epsilon/4}(S')$. This event is independent of $\cap_i A^i_n$. Since $D_n$ contains the event
$C_n \not\subset B_{\epsilon/2}(S')$, it is enough for us to show that $P(D_n)$ decays exponentially. To see this, we
introduce and sketch a proof of the following lemma. (See [H] for more delicate asymptotics
of $P(u_n \in C_n)$.)

Lemma 2.1. There exists a constant $a$ such that $P(u_n \in C_n) \leq a e^{-n\|u\|}$ for all $n$ and $u$.

Proof. If $u = u_n$, then it is clear that $P(u_n \in C_n) \leq e^{-n\|u\|}$. (Simply use the FKG inequality
to observe that for any integer $m$, we have $P(u_{mn} \in C_{mn}) \geq P(u_n \in C_n)^m$ and apply the
standard subadditivity argument to the log limits.) If $u \neq u_n$, then it suffices to observe
that $e^{-n\|u\|}$ and $e^{-n\|u_n\|}$ differ by at most a constant factor. □
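
The parenthetical step in the proof can be spelled out as follows. This is a sketch under the assumption $u = u_n$ (so that $u_{mn} = u$ for every integer $m \geq 1$), and it uses the existence of the limit defining $\|u\|$:

```latex
\begin{align*}
P(u_{mn} \in C_{mn}) &\geq P(u_n \in C_n)^m
  && \text{(FKG and translation invariance)} \\
-\frac{1}{mn} \log P(u_{mn} \in C_{mn}) &\leq -\frac{1}{n} \log P(u_n \in C_n)
  && \text{(take logarithms, divide by $-mn$)} \\
\|u\| = \lim_{m \to \infty} -\frac{1}{mn} \log P(u_{mn} \in C_{mn})
  &\leq -\frac{1}{n} \log P(u_n \in C_n)
  && \text{(let $m \to \infty$)} \\
P(u_n \in C_n) &\leq e^{-n\|u\|}.
\end{align*}
```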

The probability that any particular vertex of $B_\epsilon(S') \setminus B_{\epsilon/2}(S')$ is connected to any particular
vertex in $B_{\epsilon/4}(S')$ is bounded above by $a \exp[-n \inf\{\|u\| : |u| = \epsilon/4\}]$, where $|u|$ is the
Euclidean norm. Since the number of pairs of points of this type grows polynomially in $n$,
the result follows.

2.3 Upper bound on probabilities 

Fix $\gamma > 0$ and choose a finite set of points $a^1, a^2, \ldots, a^k$ in $S$ such that every collection $S'$ of
line segments that contains the points $a^i$ has total length greater than $\lambda(S) - \gamma$ (or greater
than some large value $N$ if $\lambda(S)$ is infinite) and that for some sufficiently small $\epsilon > 0$, this
remains true if each $a^i$ is replaced by some $c^i \in B_\epsilon(a^i)$. (The reader may check that such a
set of points and such an $\epsilon$ exist for any $\gamma > 0$.) We know that

\[ \limsup -\frac{1}{n} \log P(\delta(S, C_n) < \epsilon) \]

is at least as large as

\[ \limsup -\frac{1}{n} \log P(\text{some } c^i \in B_\epsilon(a^i) \text{ is contained in } C_n). \]

We claim that the latter is at least $\lambda(S) - \gamma$. If $C_n$ does contain all of the $c^i_n$, then it must
contain a subgraph that is a tree with the $c^i_n$ as vertices. If we remove all branches of this
tree that do not contain a $c^i_n$, then a straightforward induction on $k$ shows that we are left
with a tree $T$ in which at most $k - 2$ vertices have more than two neighbors. Denote by $b^i_n$
the vertices with this property. The path-connectedness-in-$T$ relation puts a tree structure
on the set of $b^i_n$ and $c^i_n$. Each edge of this new tree $T'$ represents a pair of these points joined
by a path, and all of these paths are disjoint.

Now, given a specific set of points $b^i_n$ and $c^i_n$ and $T'$, we have by the BK inequality
and Lemma 2.1 that the probability that these disjoint paths are contained in $C_n$ is at
most $a e^{-\lambda(T') n}$, where $\lambda(T')$ is the sum of the correlation lengths of the edges of $T'$, and by
assumption this value is at least $\lambda(S) - \gamma$. Since the number of possible choices for the $b^i_n$
and the $c^i_n$ grows polynomially, and since $\gamma$ can be chosen arbitrarily small, this completes
the proof.
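
One way to write out the BK step is sketched below, under the assumption that $T'$ has edges $e_1, \ldots, e_m$ with endpoints $x_j, y_j$ (these names are ours). Written this way the constant comes out as $a^m$ rather than $a$; since $m$ is bounded in terms of $k$, this does not affect the exponential rate.

```latex
\begin{align*}
P(\text{the paths of } T' \text{ occur disjointly})
  &\leq \prod_{j=1}^{m} P(x_j \text{ and } y_j \text{ are joined by an open path})
  && \text{(BK inequality)} \\
  &\leq \prod_{j=1}^{m} a\, e^{-n \|x_j - y_j\|}
  && \text{(Lemma 2.1 and translation invariance)} \\
  &= a^{m} e^{-n \lambda(T')},
  \qquad \lambda(T') = \sum_{j=1}^{m} \|x_j - y_j\|.
\end{align*}
```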

2.4 Steiner trees 

Given points $a^1, \ldots, a^k$, a (correlation norm) Steiner tree spanning $\{a^i\}$ and the origin is an
element $T$ of $\Omega$ for which $\lambda(T)$ is minimal among sets containing the $\{a^i\}$. Existence of at
least one Steiner tree follows from compactness arguments, and Steiner trees are trees with
at most $k - 2$ vertices in addition to $a^1, \ldots, a^k$ jS]. Although the Steiner tree spanning a
set of points is not always unique, strict convexity of the correlation norm implies that the
number of Steiner trees is always finite. See [10] for a general reference on Steiner trees. The
proof of Theorem 1.1 now yields the following:
Theorem 2.2. Let $d \geq 2$ and let $\Gamma \subset \Omega$ be Borel-measurable. If $C_n$ is the origin-containing
cluster in a subcritical percolation conditioned on $\{a^i_n\} \subset C_n$, then

\[ -\inf_{S \in \Gamma^\circ} \bigl(\lambda(S) - \lambda(T)\bigr) \leq \liminf_{n \to \infty} \frac{1}{n} \log P(C_n \in \Gamma) \leq \limsup_{n \to \infty} \frac{1}{n} \log P(C_n \in \Gamma) \leq -\inf_{S \in \overline{\Gamma}} \bigl(\lambda(S) - \lambda(T)\bigr) \]

where $T$ is any Steiner tree spanning $\{a^i\}$ and the origin.

In other words, these conditioned $C_n$ satisfy a large deviation principle with rate function
given by $I(S) = \lambda(S) - \lambda(T)$. In particular, if $T_j$, for $1 \leq j \leq m$, are the Steiner trees
spanning $\{a^i\}$ and the origin, and $B_\epsilon(T_j) = \{S : \delta(S, T_j) < \epsilon\}$, then we have

\[ \lim_{n \to \infty} \frac{1}{n} \log P\bigl(C_n \notin \cup_j B_\epsilon(T_j)\bigr) = -\inf_{S \notin \cup_j B_\epsilon(T_j)} \bigl(\lambda(S) - \lambda(T)\bigr). \]

That is, the probability that $C_n$ fails to approximate one of these Steiner trees tends to zero
exponentially.
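
To make the notion of a Steiner tree from Section 2.4 concrete, here is a minimal sketch for three terminals. It replaces the correlation norm with the Euclidean norm (an assumption made purely for illustration, since the correlation norm has no closed form here) and uses Weiszfeld's iteration to locate the single extra vertex. For three Euclidean terminals whose triangle has all angles below 120 degrees, the Steiner tree is the star joining that vertex to the terminals. The function names are ours.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def fermat_point(terminals: Sequence[Point], iterations: int = 500) -> Point:
    """Approximate the point minimizing the sum of Euclidean distances
    to the terminals, using Weiszfeld's fixed-point iteration.

    Simplification: if an iterate lands exactly on a terminal, the
    iteration stops and returns that iterate.
    """
    dim = len(terminals[0])
    # Start from the centroid of the terminals.
    x = tuple(sum(p[i] for p in terminals) / len(terminals) for i in range(dim))
    for _ in range(iterations):
        weights = []
        for p in terminals:
            r = math.dist(x, p)
            if r == 0.0:
                return x
            weights.append(1.0 / r)
        total = sum(weights)
        # Inverse-distance weighted average of the terminals.
        x = tuple(
            sum(w * p[i] for w, p in zip(weights, terminals)) / total
            for i in range(dim)
        )
    return x


def star_length(center: Point, terminals: Sequence[Point]) -> float:
    """Total Euclidean length of the star joining center to each terminal."""
    return sum(math.dist(center, p) for p in terminals)


# The origin together with two further points; all triangle angles are below 120 degrees.
terminals = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
center = fermat_point(terminals)
tree_length = star_length(center, terminals)
```

For these terminals a direct calculation using the symmetry about the line $x = 2$ gives the minimal star length $3 + 2\sqrt{3}$, which is shorter than the best path through two sides of the triangle.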